[Telemetry] Validate ES and SO clients status before fetching telemetry #136748

Conversation

afharo
Member

@afharo afharo commented Jul 20, 2022

Summary

Fetching the Snapshot telemetry report can be very noisy in the logs if ES or the SO services are down. Because so many collectors rely on those clients, each of them logs its own connection errors when the services are unreachable.

This PR checks the status of those services before generating the Snapshot Telemetry report.
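
A minimal sketch of the idea, not the code merged in this PR: check both service statuses once, and skip all collectors when either client is unusable. The types and names below (`StatusLevel`, `ServiceStatuses`, `areClientsUsable`, `fetchSnapshotTelemetry`) are hypothetical and simplified, and treating a degraded service as still usable is an assumption.

```typescript
// Hypothetical, simplified status model; these are not Kibana's actual types.
type StatusLevel = 'available' | 'degraded' | 'unavailable' | 'critical';

interface ServiceStatuses {
  elasticsearch: StatusLevel;
  savedObjects: StatusLevel;
}

// A collector is modelled as any async function returning a usage payload.
type Collector = () => Promise<Record<string, unknown>>;

// Assumption: 'available' and 'degraded' both count as usable.
function isUsable(level: StatusLevel): boolean {
  return level === 'available' || level === 'degraded';
}

function areClientsUsable(statuses: ServiceStatuses): boolean {
  return isUsable(statuses.elasticsearch) && isUsable(statuses.savedObjects);
}

async function fetchSnapshotTelemetry(
  getStatuses: () => Promise<ServiceStatuses>,
  collectors: Collector[]
): Promise<Record<string, unknown>[]> {
  const statuses = await getStatuses();
  if (!areClientsUsable(statuses)) {
    // One early exit here replaces one connection error per collector.
    return [];
  }
  return Promise.all(collectors.map((collect) => collect()));
}
```

Returning an empty result is one possible choice for this sketch; a caller could equally log a single warning or retry later.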

Resolves #97788
Resolves #89588

Checklist

Risk Matrix

| Risk | Probability | Severity | Mitigation/Notes |
|------|-------------|----------|------------------|
| If a cluster is in a faulty state and becomes unavailable due to ES connection issues, we may never report any telemetry about that deployment. | Low | High | The snapshot data will likely be incomplete anyway, and we still have some EBT reporting the status changes. |

For maintainers

@afharo afharo added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc Feature:Telemetry release_note:skip Skip the PR/issue when compiling release notes backport:skip This commit does not require backporting v8.4.0 labels Jul 20, 2022
@afharo afharo requested a review from a team as a code owner July 20, 2022 16:34
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@afharo afharo enabled auto-merge (squash) July 20, 2022 16:50
@afharo afharo merged commit 7df8532 into elastic:main Jul 20, 2022
@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
|----|--------|-------|------|
| telemetryCollectionManager | 32 | 26 | -6 |

Unknown metric groups

API count

| id | before | after | diff |
|----|--------|-------|------|
| telemetryCollectionManager | 32 | 31 | -1 |

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@afharo afharo deleted the telemetry/validate-ES-and-SO-statuses-before-fetching-telemetry branch July 20, 2022 17:40